Abstract: This paper presents a general, indirect methodology for estimating machine
remaining useful life. Gearbox failure data recorded on a mechanical test bed at the
Applied Research Laboratory, Penn State University, are used. The method is indirect in
the sense that it first predicts the behavior of system parameters known to be sensitive
to the machine operating status, then uses those predicted values to find the predicted
machine status through the fuzzy system definitions, and finally estimates the remaining
useful life as the time from the present to the time at which the death status is
detected. Machine parameters such as temperature, vibration spectrum and level, and
acoustic emission are used in the analysis. The machine operating regions are divided into
normal operation, abnormal operation, and no operation (death), and limits are defined for
every parameter in each region. Prediction models are used to predict the time trajectory
of the machine parameters from measured history data. The predicted trajectories can be
used to determine the point in time at which the machine reaches the death status, and the
remaining time to death can be estimated from such models within an appropriate certainty
and error tolerance. Neural network and fuzzy logic modeling techniques are used for
machine parameter prediction because of their known capabilities in non-linear system
modeling, robustness, generalization, and modeling of decision uncertainty.

Key Words: Decision making; diagnosis; fuzzy logic; maintenance; neural networks;
prediction; prognosis; remaining useful life; vibration analysis.

Introduction: The remaining useful life of running machinery is very valuable information
when it is known within a certain confidence level and tolerance. An estimate with
acceptable certainty and tolerance can be used in system planning [1,2]. Such planning
leads to more efficient production, less downtime, smaller inventories, cost savings, and
smoother system upgrades. If a machine's death time is known within acceptable error
limits, a replacement can be planned early and made available in time, which can save
considerable cost, allow an appropriate installation time to be selected, and avoid sudden
machine breakdown.

However, prediction is one of the hardest problems to solve, especially for non-linear and
chaotic systems [3,4]. Most real-life systems are non-linear or chaotic. For such systems,
accurate prediction is possible only over a very short horizon; even so, knowing something
about the future is valuable even when it is not very accurate. For example, weather can
be forecast with reasonable accuracy only for the coming few days, yet it is still
important to predict the weather for the coming weeks, months, and years, even with very
low certainty and very high prediction error.

The literature of the past few years has paid some attention to techniques for estimating
machine remaining useful life (RUL) [1,5]. The problem still needs further effort in the
coming years to arrive at improved models and methods. Most RUL estimation methods are
direct: they use a history of machine measurements to estimate the remaining useful life,
or time to death, directly [5]. In this paper, machine RUL is taken to be the remaining
time to death, and death is defined as the time at which the machine is no longer useful,
whether because of a major defect, very low operating efficiency, obsolescence, or the
machine becoming impossible to operate. The RUL method presented in this paper is
indirect: it is based on predicting the future time trajectory of selected machine
parameters that are correlated with the machine's operating status regions. Knowing how
the deviation of a parameter from its nominal value correlates with the machine status,
together with the predicted deviation of that parameter at a specific time, gives a good
estimate of the remaining time to reach a given operating status. Neural network
prediction models, in conjunction with fuzzy logic based decision-making algorithms, are
used to implement this indirect methodology.

Neural network parameter-prediction models are used because of their ability to model
non-linear systems and to generalize [6,7]. Fuzzy logic operating-region locators are used
because of their ability to model uncertainty and continuous logic variables in real-world
problems [8-10].

Machine Failure Data [11]: The gearbox failure data used in this paper were obtained from
the Applied Research Laboratory (ARL) at Penn State University. The data were recorded at
the ARL using the MDTB (Mechanical Diagnostic Test Bed), which is functionally a
motor-driven-train-generator test stand. The gearbox is driven at a set input speed by a
30 Hp, 1750 rpm AC (drive) motor, and the torque is applied by a 75 Hp, 1750 rpm AC
(absorption) motor. The maximum speed and torque are 3500 rpm and 225 ft-lbs,
respectively. The speed is varied by changing the frequency supplied to the drive motor
with a digital vector drive unit. The torque is varied by a similar vector unit that
controls the current output of the absorption motor. The MDTB can test single and double
reduction industrial gearboxes with ratios from about 1.2:1 to 6:1. The gearboxes are
nominally in the 5-20 Hp range. The system is sized to provide maximum versatility in
speed and torque settings. The motors provide about 2 to 5 times the rated torque of the
selected gearboxes, so the system has good overload capability.

Ten accelerometers and an acoustic microphone are placed on the test bed. The microphone,
placed in proximity to the test bed, covers a frequency range up to 22 kHz, which slightly
exceeds the upper limit of the human audible range (about 20 kHz). A total of 32
thermocouples are available for temperature readings on the MDTB. The highest sampling
rate required was 20 kilosamples per second (kS/s) for the accelerometers and 44.1 kS/s
for the microphone. The thermocouples are sampled at 1 sample per second.

Methodology: The machine remaining useful life (RUL) estimation methodology developed in
this paper is based on a machine parameter prediction technique combined with knowledge of
the different operating regions of the machine. The method assumes that the operating
status of a specific machine is reflected in clear changes in a set of its parameters,
such as vibration, temperature, current, voltage, power, and speed. These parameters
define the machine state trajectory in a multidimensional space. The parameters that are
most sensitive to the machine operating status should be selected for the analysis. To
simplify the analysis, the machine state trajectory is mapped onto individual
two-dimensional subspaces. A prior analysis of the machine, its parameters, and their
correlation with changes of operating status is therefore needed.

Three operating status regions are assumed: Normal Operation (Health), Abnormal Operation
(Sickness), and Death (no operation, or non-useful operation), as shown in Figure 1. In
reality there are no sharp changes between those regions, and the borderlines plotted on
the graph are artificial borderlines that approximate the different operating regions. A
more realistic representation was developed using fuzzy logic description methods and is
illustrated in Figure 2. The fuzzy membership function representation allows easier
handling of the terminology, continuous representation of logical functions, smooth
transitions of status, and an easier decision-making process. This fuzzy representation of
machine status is an integral part of the machine RUL estimation process developed in this
paper. The actual measurements are used to tune the fuzzy membership functions to each
type of machine separately.
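
As an illustration of how such membership functions might be encoded, the following Python
sketch uses trapezoidal functions over the magnitude of a parameter's relative deviation
from its baseline. The trapezoidal shape, the breakpoints, and the names `trapezoid`,
`REGIONS`, and `status_memberships` are assumptions made for the example; they are not the
tuned membership functions of Figure 2.

```python
import numpy as np

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 at or below a, 1 on [b, c], 0 at or above d."""
    x = np.asarray(x, dtype=float)
    rise = np.clip((x - a) / (b - a), 0.0, 1.0)
    fall = np.clip((d - x) / (d - c), 0.0, 1.0)
    return np.minimum(rise, fall)

# Hypothetical breakpoints on the magnitude of the relative deviation
# from the baseline value (0 = machine as new). Not taken from the paper.
REGIONS = {
    "normal":   (-1.0, -0.5, 0.2, 0.4),
    "abnormal": (0.2, 0.4, 0.7, 0.9),
    "death":    (0.7, 0.9, 5.0, 6.0),
}

def status_memberships(deviation):
    """Degree of membership of one deviation value in each operating region."""
    return {name: float(trapezoid(deviation, *p)) for name, p in REGIONS.items()}

if __name__ == "__main__":
    for dev in (0.0, 0.3, 0.8, 1.0):
        print(dev, status_memberships(dev))
```

With these hypothetical breakpoints, a deviation of 0.3 belongs partly to the normal region
and partly to the abnormal region, which is the kind of smooth transition the fuzzy
representation is meant to capture.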

When a machine is in a normal operating status, all of its parameters remain bounded
within a specific region. This region is very narrow in the first operating period and
lies close to the rated values of the machine parameters, but it gradually becomes wider
as the machine ages. Some parameters increase while others decrease as a result of the
machine degradation process. Examples of accelerated degradation trajectories, reflected
in vibration information measured by accelerometers mounted on the MDTB, are shown in
Figures 3 and 4; these can be regarded as machine state transitions on two-dimensional
maps. Figure 3 shows the root mean square (rms) value of the machine vibration versus time
in seconds, measured during run #10 using accelerometer #2, and Figure 4 shows the same
quantity measured during run #10 using accelerometer #5. These transitions are driven by
actual internal physical changes in the machine, which can take place in any similar
active system, such as any rotating machinery. For the same type of machine, some units
may show an increasing trend in some of their parameters while others show a decreasing
trend, owing to their unique manufacturing and operating conditions. However, this
increase or decrease in itself may not correlate with the machine operating status as long
as it is bounded within certain limits. In other words, the relative change in a specific
parameter is more indicative of the machine operating status than its absolute value. This
is why the operating regions were generated from the deviation from the baseline value,
which indicates the machine condition at its birth (when it first came online).
Maintenance may also cause a sudden shift of the machine parameter trajectory from one
operating region to another, such as from the abnormal to the normal region, and this
needs to be taken into account during the analysis.
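
The two quantities used in this discussion, the rms vibration level and its relative
deviation from the baseline, can be sketched in Python as follows. This is a minimal
illustration under assumptions: the function names, the non-overlapping window scheme, and
the use of the first few rms values as the baseline are choices made here for the example
and are not specified in the paper.

```python
import numpy as np

def windowed_rms(signal, window):
    """RMS of consecutive non-overlapping windows of `window` samples.

    Trailing samples that do not fill a whole window are discarded.
    """
    signal = np.asarray(signal, dtype=float)
    n_windows = len(signal) // window
    trimmed = signal[: n_windows * window].reshape(n_windows, window)
    return np.sqrt(np.mean(trimmed ** 2, axis=1))

def relative_deviation(values, n_baseline):
    """Deviation from the baseline, expressed as a fraction of the baseline.

    The baseline is assumed to be the mean of the first `n_baseline` values,
    standing in for the machine condition when it first came online.
    """
    values = np.asarray(values, dtype=float)
    baseline = values[:n_baseline].mean()
    return (values - baseline) / baseline

if __name__ == "__main__":
    # Synthetic vibration record whose amplitude doubles halfway through.
    raw = np.array([1.0, -1.0, 1.0, -1.0, 2.0, -2.0, 2.0, -2.0])
    rms = windowed_rms(raw, window=4)
    print(rms, relative_deviation(rms, n_baseline=1))
```

Working with the relative deviation rather than the raw rms value lets units of the same
machine type with different baseline levels share one set of operating-region definitions.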

Figure 1. Graphical representation of the definitions for machine operating status regions.

Figure 2. Fuzzy membership function definitions for machine operating status regions.

[Plot: Accelerometer A02, Run 10]

Figure 3. The rms value of the machine vibration versus time in seconds, measured
during run #10 using accelerometer #2.

[Plot: Accelerometer A05, Run 10]

Figure 4. The rms value of the machine vibration versus time in seconds, measured
during run #10 using accelerometer #5.

When a machine parameter transition is recorded in this manner, machine monitoring and
diagnosis can be performed using the actual online measurements. Machine monitoring is
very useful for many operation and maintenance applications, and remaining useful life
estimation is crucial to many other planning and maintenance considerations. The RUL
estimate is based not on the measured parameter value but on the parameter value predicted
from measured history data. The state trajectory of the predicted parameter indicates when
the machine moves from one operating region to another. The problem is thereby reduced to
a prediction problem. Even though prediction of non-linear systems is very hard to
achieve, it is hoped that appropriate prediction models can be built and improved with
time. Those models use history values in order to predict future values. Linear systems
are the easiest to predict, since a few history points are enough to predict far into the
future. Unfortunately, truly linear systems are rare in practice, and prediction becomes
one of the most challenging problems to solve. Some non-linear systems are nevertheless
predictable within limits and with variable prediction errors. Chaotic systems, which are
a category of non-linear systems, are not predictable because of their sensitive
dependence on initial conditions [3,4,12]: a minute change in the initial operating point
may lead to a completely different time trajectory, which makes the prediction problem for
such systems almost impossible to solve, at least in the time domain.
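
Sensitive dependence on initial conditions can be seen in a standard textbook example, the
logistic map with parameter r = 4. The sketch below is illustrative only: the logistic map
is not part of the gearbox study, and the function name, starting point, and perturbation
size are arbitrary choices made here.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate x[k+1] = r * x[k] * (1 - x[k]) and return all visited states."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

if __name__ == "__main__":
    # Two starting points that differ by one part in a billion.
    a = logistic_trajectory(0.2, 80)
    b = logistic_trajectory(0.2 + 1e-9, 80)
    gaps = [abs(p - q) for p, q in zip(a, b)]
    print("gap after 5 steps:", gaps[5])
    print("largest gap within 80 steps:", max(gaps))
```

The gap between the two trajectories stays tiny for the first few steps and then grows
until the trajectories are unrelated, which is why only short-horizon prediction is
meaningful for such systems.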

It is assumed that the systems discussed in this paper are non-linear but not chaotic.
Such systems are predictable within limits and with some prediction error that depends on
the nature of the system and on the prediction method used. The prediction time step,
which is chosen according to the nature of the system and the solution method, is also a
factor in the prediction error. Generally, a single time step can be predicted with very
high accuracy, and one-step prediction may give a trajectory that is very close to the
actual trajectory. However, its horizon is only one time step into the future, which is of
little use for predicting machine remaining useful life. For machine RUL, iterative
prediction can be used instead. In the iterative prediction scheme, every predicted point
is appended to the previous history points as if it were an actual measurement and is then
used to predict the next future point. One-step prediction is expected to have a very
small error, but in iterative prediction the error compounds every time the prediction is
repeated, so the predicted trajectory deviates increasingly from the actual trajectory as
the prediction horizon grows.

A confidence level, or certainty, in the prediction, and consequently in the RUL estimate,
therefore needs to be established. For example, if the model predicts that the machine RUL
is a time (t), then a certainty (C) for that decision must be provided to the user for the
information to be useful in practical applications. This certainty is formulated as a
function of the accurate prediction probability and the degree of fuzzy membership of the
operating status on which the decision was made. The accurate prediction probability is
computed using two methods. The first method assumes a uniform probability distribution,
meaning that one-step prediction is achieved with the same probability anywhere in the
operating spectrum. In this case the total probability of accurate iterative prediction
can be computed as:

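$P^{t} = (P^{1})^{n}$  (1)

where $P^{t}$ is the total accurate prediction probability, $P^{1}$ is the one-step
accurate prediction probability, and $n$ is the number of iterated prediction steps. The
superscripts t and 1 are labels; only n acts as an exponent.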

The second method assumes that the distribution of the prediction probability changes
over time, so the total iterative prediction probability can be computed as:

$P^{t} = P^{1} P^{2} \cdots P^{n}$  (2)

where $P^{1}, P^{2}, \ldots, P^{n}$ are the accurate prediction probabilities at time
steps 1, 2, ..., n.
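
The two ways of computing the total accurate prediction probability can be sketched in
Python as follows, reading Equation (1) as the one-step probability raised to the number
of steps and Equation (2) as the product of the per-step probabilities. The function names
and the numerical values are hypothetical and serve only to show how quickly the total
probability decays with the prediction horizon.

```python
import math

def total_probability_uniform(p_one_step, n_steps):
    """Equation (1): the same one-step probability at every iterated step."""
    return p_one_step ** n_steps

def total_probability_varying(step_probabilities):
    """Equation (2): a separate accurate-prediction probability per step."""
    return math.prod(step_probabilities)

if __name__ == "__main__":
    # Hypothetical values: 99% one-step accuracy iterated over 50 steps.
    print(total_probability_uniform(0.99, 50))
    # Hypothetical per-step probabilities that degrade with the horizon.
    print(total_probability_varying([0.9, 0.8, 0.5]))
```

Even a one-step probability of 0.99 gives a total of only about 0.6 after 50 iterations,
which illustrates why a certainty value has to accompany any long-horizon RUL estimate.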

The degree of fuzzy membership of any system parameter is estimated using fuzzy membership
function definitions similar to those shown in Figure 2. A simple rule base is used to
decide the operating status of the machine at any point. After the different parameter
values are fed into the fuzzy system, a decision is made about the status of the system.
This decision is a fuzzy set, which results from the fuzzy inferencing process involving
both implication of the individual rules and aggregation of the collective rules.
Defuzzifying this output gives a crisp number reflecting its degree of membership in a
specific operating region. That number ($Z_{death}$), estimated using the fuzzy output
membership functions, is combined with the accurate prediction probability ($P^{t}$)
defined above to generate a total certainty level in the decision as follows:

$C = Z_{death} P^{t}$  (3)

The estimated machine RUL is then computed as:

$RUL = n \Delta t$  (4)

where $n$ is the number of points predicted until the death region is located, and
$\Delta t$ is the prediction time step.

In addition to the certainty in that decision, an estimated error margin, or tolerance,
needs to be provided. It is computed as an error bar around the estimated RUL:

$RUL = n \Delta t \pm n \Delta t \, \sigma_{c}$  (5)

where $\sigma_{c}$ is the estimated standard deviation of the iterative prediction at the
current estimation point.



A more conservative estimate can be computed as:

$RUL = n \Delta t \pm n \Delta t \, (1 - C)$  (6)

A less conservative, or more optimistic, estimate can be computed as:

$RUL = n \Delta t \pm n \Delta t \, (1 - C) \, \sigma_{c}$  (7)
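
The certainty and tolerance formulas can be collected in a short Python sketch. It assumes
the following reading of Equations (3) to (7): the certainty is the product of the
defuzzified death membership and the total prediction probability, the RUL is the number
of predicted steps times the time step, and the three tolerances scale the RUL by the
standard deviation, by one minus the certainty, and by their product, respectively. The
function name, the dictionary keys, and the numerical inputs are hypothetical, and the
standard deviation is assumed to be a relative (dimensionless) value.

```python
def rul_estimate(n_steps, dt, z_death, p_total, sigma_c):
    """Return the RUL, its certainty, and three candidate error margins.

    n_steps : number of predicted points until the death region is located
    dt      : prediction time step
    z_death : defuzzified degree of membership in the death region (0..1)
    p_total : total accurate prediction probability (0..1)
    sigma_c : estimated relative standard deviation of the iterative prediction
    """
    rul = n_steps * dt                  # Eq. (4)
    certainty = z_death * p_total       # Eq. (3)
    return {
        "rul": rul,
        "certainty": certainty,
        "tol_sigma": rul * sigma_c,                          # Eq. (5)
        "tol_conservative": rul * (1.0 - certainty),         # Eq. (6)
        "tol_optimistic": rul * (1.0 - certainty) * sigma_c, # Eq. (7)
    }

if __name__ == "__main__":
    # Hypothetical inputs: 40 steps of 10 s, membership 0.8, probability 0.5.
    print(rul_estimate(n_steps=40, dt=10.0, z_death=0.8, p_total=0.5, sigma_c=0.1))
```

For these hypothetical inputs the RUL is 400 s with a certainty of 0.4, and the margin
from Equation (7) is the narrowest of the three as long as the relative standard deviation
is below one.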

Prediction Models: Several methods for non-linear system prediction exist in the
literature [3,6,8,10,12]. Some are based on time series prediction [13]; others are based
on multiple-input single/multiple-output non-linear system modeling [6,8]. Neural network
models are the easiest and fastest to build, in addition to offering many other advantages
such as robustness, generalization, learning, and model-free estimation [6,8,10]. Neural
networks are capable of modeling non-linear systems [7,13] and have previously been
adopted for time series prediction and for multiple-input single/multiple-output system
modeling [6,7,13]. Multiple-input single-output neural network models best suit the
problem at hand. Since RUL estimation deals, most of the time, with dynamic systems and
components, dynamic neural networks are preferred over static neural networks for such
applications because of their ability to model the time behavior of a system. Recurrent
neural networks, a type of dynamic network with this ability, are therefore used to
predict the machine state trajectory for the problem at hand.
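
The iterative prediction scheme described in the Methodology section can be sketched
independently of the predictor used. The Python example below is a minimal sketch under
stated assumptions: a least-squares linear autoregressive model stands in for the
recurrent neural network, and a crisp threshold stands in for the fuzzy detection of the
death region. The function names, the model order, the threshold, and the synthetic
history are all hypothetical.

```python
import numpy as np

def fit_ar(history, order):
    """Least-squares fit of x[k] = w . (x[k-order], ..., x[k-1]).

    This linear model is only a stand-in for a one-step neural predictor.
    """
    history = np.asarray(history, dtype=float)
    X = np.array([history[i:i + order] for i in range(len(history) - order)])
    y = history[order:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def steps_to_threshold(history, coeffs, threshold, max_steps=1000):
    """Iterate one-step predictions, feeding each one back as if measured.

    Returns the number of steps n until the predicted value first reaches
    `threshold`, or None if that does not happen within `max_steps`.
    """
    order = len(coeffs)
    window = [float(v) for v in history[-order:]]
    for n in range(1, max_steps + 1):
        nxt = float(np.dot(coeffs, window))
        if nxt >= threshold:
            return n
        window = window[1:] + [nxt]   # predicted point joins the history
    return None

if __name__ == "__main__":
    # Synthetic, steadily rising parameter deviation (hypothetical data).
    history = [1.0 + 0.1 * k for k in range(20)]
    coeffs = fit_ar(history, order=2)
    n = steps_to_threshold(history, coeffs, threshold=3.95)
    dt = 60.0                          # assumed prediction time step, seconds
    print("steps to death region:", n, "estimated RUL (s):", n * dt)
```

Because each predicted point is reused as an input, any one-step error is carried forward
into later steps, so in practice the returned step count should be reported together with
a certainty and a tolerance.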


Conclusion: A general methodology was developed for estimating machine remaining useful
life from history data. The method is indirect: it starts by identifying parameters that
are sensitive to the machine operating regions and transitions, then defines fuzzy
operating regions, and then builds prediction models for those parameters. If the time
trajectory of the machine parameters can be predicted with a known accuracy, and a fuzzy
logic decision-making system can detect the machine operating status with some certainty,
then an estimate of the machine remaining useful life can be computed with a known
certainty and a known tolerance. In this method, neural network models are used to predict
the future trajectory of the machine parameters from measured history data with some
estimated probability of success. Neural network models are adopted because of their known
ability to model non-linear system behavior, and dynamic neural network models are
expected to outperform static ones in such applications because of their ability to model
the time behavior of a system. The output of the prediction models is fed into a fuzzy
logic decision-making system in order to locate the machine operating region at any time
with some certainty. These methods are being tested using practical gearbox failure data
from the Mechanical Diagnostic Test Bed at the Applied Research Laboratory. Some of the
failure data have been analyzed, but the prediction models have not yet been fully
developed for this type of data and methodology.